Control of nucleic acid amplification procedures
including the polymerase chain reaction (PCR)

The present invention relates to the optimisation of cycling
conditions used to control polymerase chain reactions.

Optimisation of the temperature control used for PCR
amplification requires careful consideration of reaction
conditions. The complex nature of the reaction and the
interactions between essential reaction components mean that
traditional kinetic methods of analysis cannot be readily
applied to predict optimum cycling conditions. The process
described here overcomes these problems by using a novel
combination of "grey-box" modelling, genetic algorithms and
neural networks to model and predict the level of
amplification for defined sets of cycling conditions. This
can be used to determine which parts of the temperature
profile have greatest effect on the reaction. Genetic
algorithms are used to model the effects that changes in the
temperature profile have on amplification. These algorithms
can then be used to define temperature cycles that give
increased reaction performance. By linking this modelling
process with on-line monitoring of the amplification process,
real-time optimisation of reactions is possible. This is of
particular importance to quality-control-sensitive procedures
such as PCR-based diagnostics.

US Patent No 4683195 (Mullis et al., Cetus Corporation)
discloses a process for amplification of nucleic acid by the
polymerase chain reaction (PCR). Short oligonucleotide
sequences (primers), usually 10-40 bases long, are designed
to hybridise to the flanking regions either side of the
target sequence to be amplified. These primers are added in
excess to the target sequence DNA. A suitable buffer,
magnesium chloride, a thermostable polymerase and free
nucleotides are also added.

A process of thermal cycling is used to amplify the DNA,
typically several million-fold. Amplification is facilitated
through cycling of the temperature. The target DNA is
initially denatured at 95 °C and then cooled, generally to
between 40 °C and 60 °C, to enable annealing of the primers
to the separated strands. The temperature is then raised to
the optimal temperature of the polymerase, generally 72 °C,
which extends the primer to copy the target sequence. This
series of events is repeated (usually 20 to 40 times). During
the first few cycles, copies are made of the target sequence.
During subsequent cycles, copies are made from copies,
increasing target amplification exponentially.

Describing PCR mathematically may not be possible using
traditional kinetic notation because of the complex
interactions between reaction components (see "A simple
procedure for optimising the polymerase chain reaction (PCR)
using modified Taguchi methods", Cobb and Clarkson, (1994),
Nucleic Acids Research, Vol. 22, No. 18, pp. 3801-3805).
Mg²⁺ and deoxynucleotide triphosphates (dNTPs) have been
shown to affect the efficiency of priming and extension by
altering the kinetics of hybridisation and dissociation of
primer-template duplexes at denaturing, annealing and
extension temperatures. These components are also involved in
altering the efficiency with which the polymerase recognises
and extends such duplexes. The concentrations of Mg²⁺ and
dNTPs required for optimal amplification depend largely on
the target and primer sequences, with the nucleotides at the
3' end of the primer having a major effect on the efficiency
of mismatch extension. Certain mismatch nucleotide
combinations may be amplified more efficiently under certain
reaction conditions than others. The presence of excess Mg²⁺
in a reaction may result in the accumulation of non-specific
amplification products, and insufficient concentrations
reduce product yield. In addition, dNTPs quantitatively bind
Mg²⁺ ions, so that any modification in dNTP concentration
requires a compensatory adjustment of MgCl₂.

PCR optimisation conventionally requires repetitive
trial-and-error adjustment of important reaction parameters.
Reactions optimised in this way are generally not robust and
are susceptible to small variations in the temperature
profile and/or minor fluctuations in the composition of the
reaction mixture. The complexity, and to a certain degree the
uncertainty, of the reaction means that modelling is
difficult. Where models have been proposed (see "Polymerase
chain reaction engineering", Hsu et al., (1997),
Biotechnology and Bioengineering, Vol. 55, No. 2,
pp. 359-366), important reaction elements have been ignored.
Importantly, current models assume that denaturation,
extension and annealing occur at fixed temperatures in the
cycle, predominantly because thermal cyclers are programmed
with fixed temperatures for each of these principal events.
However, this is an oversimplification since the rate of
these events is temperature dependent, such that they occur
over a wide temperature range.

Theoretically, amplification of specific template sequences
should follow an exponential function, i.e. under perfect
conditions the amount of template amplified will double after
each cycle of the reaction. However, the fidelity and rate of
amplification are controlled by a complex interaction between
the reaction components, so that the theoretical optimum is
never achieved. Under normal conditions, the accumulation of
product becomes limited during the later cycles since the
number of duplexes for extension exceeds the enzyme activity
in the reaction. At this point, the accumulation of product
becomes linear. This is compounded by thermal inactivation of
the polymerase with prolonged exposure to temperatures in
excess of 80°C. Amplification can be optimised by careful
consideration of annealing temperatures, annealing times and
annealing ramps. It is possible to increase the annealing
temperature to avoid non-specific priming by adjusting the
ramp rate to compensate for the reduced rate of priming. This
will increase the cycle range during which exponential
accumulation of the target sequence occurs. The rate of
priming and the temperature range over which priming occurs
will depend on the amount of free Mg²⁺.

Similar optimisation of denaturation times and ramps will
have an impact on amplification since Taq polymerase becomes
denatured with excessive exposure to the high denaturation
temperatures (typically > 94°C for 1 min to 5 min) (see
"Kinetics of inactivation for thermostable DNA polymerase
enzymes", Mohapatra and Hsu, (1996), Biotechnology
Techniques, Vol. 10, pp. 569-572). Although polymerase is
normally added in excess, successive denaturation steps in
the PCR have a significant impact on the amount of polymerase
denaturation. Additionally, these temperature conditions
cause depurination of the DNA template (typically every 2 kb
at 94°C min⁻¹). Since denaturation occurs before and after
the set denaturation temperature has been achieved (typically
DNA denatures with increasing velocity between 70°C and
90°C), modification of ramp times can be used to limit the
time spent at 94°C.

Polymerases such as Taq have been well characterised. They
show classic temperature dependency, with a gradual increase
in extension rate as the temperature rises. Activity reaches
an optimum (typically ca. 70°C), after which activity drops
sharply (typically > 80°C). Extension will occur over an
extended temperature range. It is possible to reduce
extension times by consideration of the total amount of
extension over this range. For example, a significant amount
of extension will occur at ca. 60°C. Oligonucleotides that
hybridise at this temperature will be extended immediately.
Extension times can be reduced or, in some instances,
eliminated altogether.

The present invention seeks to provide optimisation of 
cycling conditions used to control polymerase chain 
reactions. 

According to an aspect of the present invention there is
provided a method of optimising the cycling conditions used
to control a polymerase chain reaction as specified in
claim 1.

The preferred embodiment provides a process which allows
intelligent control of the PCR. This is achieved by modelling
and predicting levels of amplification through a novel
combination of membership function assignment (association of
reaction events with temperature), genetic algorithms and
artificial neural networks. Here, the membership component
infers and provides a crisp definition for the various
reaction parameters that determine the degree of
amplification for a specific reaction. Genetic algorithms are
used to determine the optimum times for each step of the
temperature cycle. The neural network component is then used
to enhance the membership rules and membership functions.
After an initial training, the neural network can be used to
update the membership functions as it learns more from its
input signals. This process may be used to accurately predict
optimum reaction conditions (Figure 1).

Preferably, the process is used to transfer protocols from 
one thermal cycler to another, wherein the relative 
contributions of denaturation, annealing and extension are 
first calculated taking into account the thermal performance 
of the source cycler, and then transferred to the target 
cycler by taking into account the differences in cycler
performance.

An embodiment of the present invention is described below, by 
way of example only, with reference to the accompanying 
drawings, in which: 

Figure 1 is a schematic representation of intelligent
thermal cycler control using membership functions for
specific reaction events and genetic algorithms to predict
levels of amplification for set cycling conditions;

Figure 2 shows the prediction of amplification levels using a
self-learning control process;

Figure 3 shows a sigmoidal membership function for template
denaturation;

Figure 4 shows examples of modelling of PCR amplification
using membership functions for denaturation, annealing and
extension;

Figure 5a is a schematic diagram of the key events in the
preferred genetic algorithm;

Figure 5b shows the operation of the genetic algorithm of
Figure 5a;

Figures 6a and 6b show, respectively, examples of one-point
and two-point cross-over;

Figure 7 is a schematic representation of a node in an
artificial neural network and a three-layer artificial neural
network;

Figure 8 shows a sigmoid transfer function with sigmoid gains
of 0.25 to 2.00; and

Figures 9 to 15 show the results relating to the preferred
method.

New technologies for monitoring the progress of PCR in
real time (e.g. fluorogenic 5' nuclease chemistries, PE
Applied Biosystems, and ethidium bromide fluorescence (see
"Kinetic PCR analysis: real-time monitoring of DNA
amplification reactions", Higuchi et al., (1993),
Biotechnology, Vol. 11, pp. 1026-1030)) have been recently
described. Although these methods allow the quantification of
product formed during the course of a reaction, the main
benefits of real-time monitoring will come from algorithms
that can accurately predict and maintain optimal
amplification. The process described in this application can
be used in conjunction with product detection systems to
provide real-time dynamic control of PCRs, continuously
updating cycling conditions as the reaction proceeds and so
maintaining optimum performance.

PCR conditions generally need careful consideration in order
to optimise the level of amplification. However, since the
polymerase chain reaction is complex, traditional kinetic
methods of analysis cannot easily be used to predict optimum
conditions. This is compounded by the complex interaction
between the reaction components. The preferred embodiment
seeks to overcome the problems traditionally associated with
PCR optimisation by predicting the level of amplification
from the amount of denaturation, annealing and extension
calculated over the entire range of temperatures in the
cycling profile. Using a novel combination of "grey-box"
modelling that applies membership functions to each of these
events, genetic algorithms and artificial neural networks,
the level of amplification can be predicted. Analysis of the
weights generated by the neural network can then be used to
define reaction optima in terms of the time taken for each of
the cycling events. The process can be described by the
following:

i. Grey-box modelling is used to define optimum annealing,
extension and denaturation temperatures and the
temperature range over which these events occur.

ii. Membership functions are applied to each of these
events to predict the level of amplification over a given
cycle.

iii. Genetic algorithms are used to determine optimum times
for each of these stages.

iv. Neural networks may be used to confirm and/or modify
the predicted level of amplification.

v. Real-time monitoring may be used to refine the process
further.

This approach provides the basis of PCR-specific control
software that can be applied to the standardisation of
thermal cycler control, and is the first description of an
intelligent control process for PCR optimisation.

Defining membership functions 

The basis of the preferred process involves converting
important elements of the PCR (denaturation, annealing and
extension) into a series of membership functions. "Grey-box"
modelling of the reaction is initially used to generate the
series of membership functions and rules for the various
parameters that affect reaction performance, i.e. factors
that influence template denaturation, primer annealing and
extension of the primer/template duplex. A rule base is used
to associate these events with specific temperatures in the
amplification cycle. Crisp values for each reaction variable
are used to predict the level of amplification over single or
multiple cycles. By using membership functions it is possible
to model the PCR dynamically and take into account changes in
the rate of the various processes over the whole temperature
cycle, and not only at specific stages as with conventional
modelling methods.

PCR can be considered to comprise three principal
events; namely, denaturation of the double-stranded
template, annealing of the primers to the single-stranded
denatured template, and polymerisation of these duplexes.
The rate at which these processes occur is temperature
dependent. Interactions between various elements in the
reaction are additional regulators of rate and optima.
"Grey-box" modelling of these processes allows membership
functions for these events to be associated with specific
temperatures (Figure 2). Genetic algorithms can be used to
modify the time allocated for each stage (t1...t6) to
optimise amplification and limit the effects of component
interactions, inactivation, depurination, etc. The neural
network element is used to learn what initial times to use
in order to reduce the time taken to calculate the stage
optima.

Denaturation membership functions 

Denaturation of DNA template can be represented by a
sigmoidal melting curve. The melting temperature (T_m) of
that template defines the temperature at which half the DNA
is denatured. Little denaturation occurs below 70 °C.
Increasing the temperature above this results in a marked
increase in the rate of denaturation. These high temperatures
are also associated with an increase in the rate of enzyme
inactivation and depurination of the template. Principally,
denaturation occurs over a range of temperatures during
cycling, i.e. during ramping up to, ramping down from, and
during the assigned denaturing temperature. Denaturation
times may be significantly reduced by calculating the rate of
denaturation before, during and after the specified
denaturation temperature has been attained. This minimises
the level of enzyme inactivation and the amount of template
depurination. Consequently, the efficiency of amplification
can be increased.

Co-operative interactions between the stacked hydrogen-bonded
base pairs of the double-stranded template are progressively
disrupted during melting until the two strands are completely
separated. Membership functions (M_denature) for template
denaturation can therefore be defined by a sigmoidal curve
(Figure 3) with gain q_denature and midpoint s(x)_denature
that are defined by the predicted T_m and optimum
denaturation temperature of the template molecules. Gain
q_denature and midpoint s(x)_denature are used as modifiers
of the neural network component. Generalised curves may also
be used to define memberships for denaturation (e.g. boolean,
trapezoidal, triangular and kinetic). Typically, native DNA
with AT and GC base pairs has a T_m of 72°C. Poly(AT)
templates have T_m values of ca. 60°C and poly(GC) templates
have T_m values of ca. 90 °C.
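
As an illustration only, a sigmoidal denaturation membership
of the kind described above might be sketched as follows. The
function name, the way the gain enters the exponent (a
smaller gain giving a steeper curve) and the example values
(midpoint 72°C, gain 2°C) are assumptions made for this
sketch and are not taken from the specification.

```python
import math


def sigmoid_membership(temp_c: float, midpoint_c: float, gain: float) -> float:
    """Return a membership value between 0 and 1.

    The value is 0.5 at the midpoint temperature; a smaller gain
    gives a steeper transition (assumed convention).
    """
    return 1.0 / (1.0 + math.exp(-(temp_c - midpoint_c) / gain))


# Hypothetical template with a midpoint (T_m) of 72 degC and a gain of 2 degC.
for temp in (60, 70, 72, 74, 94):
    value = sigmoid_membership(temp, midpoint_c=72.0, gain=2.0)
    print(f"{temp} degC -> M_denature = {value:.3f}")
```

With these assumed values the membership is close to zero at
60°C, exactly one half at the midpoint and close to one at
94°C.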

Annealing membership functions

Annealing membership defines the rate of hybridisation of
primer to template. Initially the priming temperature T_p (or
T_m - 5°C) is used to define optimum hybridisation, depending
on the length of the primer. Typically, T_p and T_m are
calculated from one of the following equations:

• \( T_p = 22 + 1.46 \, (2 \cdot GC + AT) \)

• \( T_m = 81.5 + 16.6 \log_{10}[J^+] + 0.41 \, (\%G{+}C) - (600/l) - 0.63 \, (\%FA) \)

• \( T_m = 81.5 + 16.6 \log_{10}[J^+] + 0.41 \, (\%G{+}C) - 675/l \)

• \( T_m = [(A+T) \cdot 2\,^{\circ}\mathrm{C}] + [(G+C) \cdot 4\,^{\circ}\mathrm{C}] \)

where [J+] is the concentration of monovalent cations, l is
the oligonucleotide length and FA is formamide. Membership
functions for hybridisation (M_anneal) are represented by
sigmoidal curves (Figure 3) whose maxima are defined by the
primer's T_p or T_m - 5°C. Generalised curves may also be
used to define memberships (e.g. boolean, trapezoidal,
triangular and kinetic). The temperature range over which
annealing occurs will depend on the concentration of free
magnesium ions in the reaction mixture. Midpoint
s(x)_anneal and gain q_anneal define the temperature range
over which annealing takes place and are strongly associated
with the concentration of free Mg²⁺ ions in the reaction.
Gain q_anneal and midpoint s(x)_anneal are used as modifiers
of the neural network component.
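
The two base-count formulas above (the T_p formula and the
2°C/4°C rule for T_m) can be evaluated directly from a primer
sequence. The sketch below is illustrative only: the function
names and the 20-base example primer are hypothetical, and
the salt- and formamide-dependent formulas are not covered.

```python
def base_counts(primer: str) -> tuple[int, int]:
    """Return (number of A or T bases, number of G or C bases)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return at, gc


def priming_temperature(primer: str) -> float:
    """T_p = 22 + 1.46 * (2 * GC + AT), in degC."""
    at, gc = base_counts(primer)
    return 22.0 + 1.46 * (2 * gc + at)


def melting_temperature_simple(primer: str) -> float:
    """T_m = 2 degC per A/T base plus 4 degC per G/C base."""
    at, gc = base_counts(primer)
    return 2.0 * at + 4.0 * gc


# Hypothetical 20-base primer with 10 A/T and 10 G/C bases.
example_primer = "ATGCATGCATGCATGCATGC"
print(round(priming_temperature(example_primer), 1))
print(melting_temperature_simple(example_primer))
```

The two rules need not agree for a given primer, which is one
reason the text treats T_p or T_m - 5°C only as an initial
estimate of the annealing optimum.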

Extension membership functions

The dependency of Taq activity on temperature has previously
been described in detail by other workers (see
"Deoxyribonucleic acid polymerase from the extreme
thermophile Thermus aquaticus", Chien et al., (1976), Journal
of Bacteriology, Vol. 127, No. 3, pp. 1550-1557). Membership
functions for the polymerisation of duplexes are represented
by a curve whose maximum is defined by the temperature
optimum. This will depend on the specific polymerase used and
is generally in the region of ca. 70 °C. Generalised curves
may also be used to define memberships (e.g. boolean,
trapezoidal, triangular and bell).

Figure 4 shows the values of reaction memberships M_denature,
M_anneal and M_extension over a single PCR cycle for sample
tube temperatures at time t seconds. Adjustment of time t
between and during these cycle events can be used to optimise
the amount of amplification achieved. Factors reducing
amplification (e.g. depurination, inactivation of Taq
polymerase, etc.) are reduced through the optimisation
process, since optimisation presupposes a reduction in these
events. Neural net algorithms are used to determine which of
these events have greatest influence on the outcome of the
reaction.
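
A minimal sketch of how the three memberships might be
evaluated over one cycle is given below. Everything specific
in it is an assumption made for illustration: the curve
shapes (a rising sigmoid for denaturation, a falling sigmoid
for annealing, a bell curve for extension), the midpoints and
widths, the 2°C per second ramps and the hold times are
hypothetical and are not values from the specification.

```python
import math


def rising_sigmoid(temp, midpoint, gain):
    """Sigmoid that rises from 0 to 1 as temperature increases."""
    return 1.0 / (1.0 + math.exp(-(temp - midpoint) / gain))


def bell(temp, optimum, width):
    """Bell-shaped curve with a maximum of 1 at the optimum."""
    return math.exp(-(((temp - optimum) / width) ** 2))


def memberships(temp):
    """Membership of each reaction event at one temperature (assumed shapes)."""
    return {
        "denature": rising_sigmoid(temp, 80.0, 2.0),
        "anneal": 1.0 - rising_sigmoid(temp, 55.0, 2.0),
        "extension": bell(temp, 70.0, 8.0),
    }


def ramp(start, end, rate):
    """Temperatures at 1 s intervals from start towards end (end excluded)."""
    steps = int(abs(end - start) / rate)
    sign = 1 if end >= start else -1
    return [start + sign * rate * i for i in range(steps)]


# One hypothetical cycle, sampled once per second.
profile = (
    ramp(72, 94, 2.0) + [94.0] * 15      # heat to and hold at denaturation
    + ramp(94, 55, 2.0) + [55.0] * 30    # cool to and hold at annealing
    + ramp(55, 72, 2.0) + [72.0] * 30    # heat to and hold at extension
)

# Sum each membership over the whole cycle, including the ramps.
totals = {"denature": 0.0, "anneal": 0.0, "extension": 0.0}
for temp in profile:
    for event, value in memberships(temp).items():
        totals[event] += value

print(len(profile), {k: round(v, 1) for k, v in totals.items()})
```

Because the sums run over the ramps as well as the holds,
shortening a hold does not remove the contribution the ramps
make to that event, which is the effect the text relies on
when it proposes reducing hold times.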

Determination of optimum profile times using genetic
algorithms

Since changes in any one of the times set for denaturation,
annealing and extension will alter the level of
amplification, strategies for reaction optimisation are
either not obvious or computationally intensive. The process
described here uniquely uses genetic algorithms to overcome
these problems. A population of potential solutions to the
problem is maintained and repeatedly updated according to the
principles of evolution, i.e. selection, mutation and/or
recombination (crossover). Recombination selects pairs of
solutions in the population (parents) and generates a series
of new solutions (children) by combining elements from their
parents, which are then inserted into a new population of
solutions. The principle of selection demands that "good"
solutions be preferentially chosen over "bad" solutions.
This is achieved by defining a fitness function that assigns
a number to each solution according to how "good" it is.
Selection favours the propagation of solutions (chromosomes)
with the best fitness into future populations. Mutation is
used to modify solutions in the population without
interaction with the rest of the population. Genetic
algorithms therefore search for sets of alleles which,
together, produce good solutions (co-adapted alleles).

A general definition of genetic algorithms does not exist.
They can be represented by an abstraction of the work of
Holland (see "Adaptation in natural and artificial systems",
Holland, (1975), The University of Michigan Press: Ann
Arbor). A schematic of the genetic algorithm used in this
process is given in Figure 5a and is represented by the
following steps:

i. Initialisation. The initial population of chromosomes
(typically n = 25...100) is created either randomly, or
by "shuffling" an input chromosome generated from the
artificial neural network element described later. The
latter reduces the time taken by the algorithm in searching
for optima.

ii. Evaluation. The fitness of each chromosome is
evaluated. In this case, the amount of amplification is
calculated using the membership functions described
earlier and the cycle times assigned by individual
chromosomes. Each chromosome is then assigned a fitness
score, which numerically encodes its performance.

iii. Exploitation. Chromosomes with the highest fitness
score (i.e. the highest predicted amplification for the
shortest step time) are placed one or more times into a
mating subset in a semi-random fashion. Two chromosomes
are drawn at random from the population. The chromosome
with the higher fitness score is placed into the mating
subset. Both chromosomes are returned to the population
and the selection process is repeated until the mating
subset is full. This tends to remove chromosomes with low
fitness from the mating population. The chance of selection
as a parent is proportionate to a chromosome's normalised
fitness. This means that better chromosomes will generally
produce more children. However, the stochastic nature of
the process means that occasionally poor solutions will
produce children. The addition of elitism functions is also
beneficial. Here, the single best solution in any parent
generation is copied unmodified into the child generation;
all other children are generated as normal.

iv. Exploration. This uses recombination and mutation
operators to modify chromosomes in the next generation of
chromosomes (Figure 5). Two chromosomes from the mating
subset are randomly selected and mated. The probability
of mating is a controllable function and is generally
given a high value (e.g. 0.90). A recombination operator is
used to exchange genes between the two parent
chromosomes, to produce two children. In Figure 5, a
population of three individuals is shown. Each is
assigned a fitness function F. On the basis of these
fitnesses, the selection phase assigns the first
individual 0 copies, the second 2 copies and the third 1
copy. After selection, genetic operators are applied
probabilistically; the first individual has its first bit
mutated from a 1 to a 0, and crossover combines the
second two individuals into new ones. (Modified from
Forrest (1993), Science, Vol. 261, pp. 872-878.)

Recombination may involve one-point and/or two-point
crossovers (Figure 6). Where one-point crossover is used,
a crossover point is selected along the chromosome and the
genes up to that point are swapped between the two
parents. Where two-point crossover occurs, two crossover
points are selected and the genes between the two points
are swapped. Other crossover algorithms such as partially
mapped crossover (PMX), order crossover (OX) and cycle
crossover (CX) may also be used. All children are produced
by crossover of two parents; two children are produced
simultaneously. The children then replace the parents in
the next generation. Mutation of single genes is another
controllable function. Mutation rates are generally
assigned a low probability (e.g. 0.001), and can be
defined as 1/(N*L^0.5), where N is the population size and
L is the length of the chromosome.

v. This sequence of events is repeated for a fixed number
of generations or until convergence occurs. Crowding
replacement may be used to reduce the problem of premature
convergence. A child is compared with a number of existing
parents and replaces the parent most similar to itself.
Essentially, crowding replacement replaces like with like;
this allows sub-populations to explore various parts of
the genetic search space. This is particularly useful for
multi-modal search spaces where there are a number of
scattered fitness maxima.

The basic structure of the genetic algorithm used can be
formalised as:

Initialise population
Evaluate chromosomes
Calculate normalised fitness
Output statistics on population

For each successive generation
{
    Generate and evaluate candidate replacements
    Replace members of the population with candidates
    Re-evaluate unchanged members
    Normalise fitness
    Output statistics on population
}
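
The loop above can be sketched in Python as follows. This is
a simplified illustration, not the implementation of the
preferred embodiment: parents are drawn by pairwise
tournament directly rather than via a separate mating subset,
only one-point crossover is used, and toy_fitness (a count of
set bits) is a hypothetical stand-in for the predicted
amplification computed from the membership functions. All
function names and parameter values are assumptions.

```python
import random


def tournament(population, fitness, rng):
    """Draw two chromosomes at random and return the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def one_point_crossover(p1, p2, rng):
    """Swap the genes up to a random point between two parents."""
    point = rng.randrange(1, len(p1))
    return p1[:point] + p2[point:], p2[:point] + p1[point:]


def mutate(chrom, rate, rng):
    """Flip each bit independently with a low probability."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chrom]


def evolve(fitness, length=24, pop_size=50, generations=40,
           p_cross=0.90, p_mut=0.001, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []  # best fitness found in each generation
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        children = [best[:]]  # elitism: copy the best solution unmodified
        while len(children) < pop_size:
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            if rng.random() < p_cross:
                c1, c2 = one_point_crossover(p1, p2, rng)
            else:
                c1, c2 = p1[:], p2[:]
            children.append(mutate(c1, p_mut, rng))
            children.append(mutate(c2, p_mut, rng))
        population = children[:pop_size]
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(chrom):
    """Hypothetical placeholder for predicted amplification."""
    return sum(chrom)


best, history = evolve(toy_fitness)
print(history[0], history[-1])
```

Because the best chromosome of each generation is carried
over unchanged, the best fitness recorded in history cannot
decrease from one generation to the next.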

WO 99/18516    PCT/GB98/02989

Each time in the cycle profile (allele; t1...t6) is converted
into bits (I = {0;1}) according to the min/max time allowed
for specific stages in the cycle, and the required time
resolution (Table 1). These are then grouped together to form
bit strings (chromosomes) representing the total time profile
of a single cycle. Initially a random population of strings
is created according to the minimum and maximum times allowed
for each stage. The neural network may be used after training
to set initial times according to the membership functions
assigned for the denaturation, annealing and extension steps.
(Note: Traditional binary encoding has a drawback in that in
certain instances all bits must be changed to alter the
number by 1. This makes it difficult for an individual that
is close to an optimum to move closer by mutation. The use of
Gray codes is preferred since incrementing or decrementing
any number by 1 is always a one-bit change.)
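
The property noted above can be checked with the standard
reflected binary Gray code. The sketch below is illustrative;
the function names are chosen for this example only.

```python
def to_gray(n: int) -> int:
    """Convert a non-negative integer to its reflected Gray code."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Convert a reflected Gray code back to the integer it encodes."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Going from 7 to 8 changes four bits in plain binary (0111 -> 1000)
# but only one bit in Gray code (0100 -> 1100).
print(hamming(7, 8), hamming(to_gray(7), to_gray(8)))
```

A chromosome would then store to_gray(t) for each time allele
and apply from_gray before the fitness is evaluated, so that
a single-bit mutation can always move a time by one
resolution step.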

Table 1: Bit data encoding t, the time taken for the thermal
cycler to heat from annealing (72°C) to denaturation (94°C).
The maximum allowable time is set at 120 seconds (5 °C sec⁻¹
ramp).

Resolution   Bit size   Bits      t1...t6
1 sec        7 bit      1111000   42 bit
2 sec        6 bit      111100    36 bit
5 sec        5 bit      11000     30 bit
10 sec       4 bit      1100      24 bit

The minimum times for each event in the cycle are fixed by
the performance of the thermal cycler. Generally, minimum
ramp rates are set between 0.5 °C sec⁻¹ and 1 °C sec⁻¹.
Maximum ramp rates generally do not need to be set higher
than 5 °C sec⁻¹. The maximum time for cycle events will
therefore depend on the ramp rate and the temperature
difference between successive steps.
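
As a worked illustration of this dependence, the sketch below
computes the time needed for a ramp at a given rate and the
number of bits needed to encode a time range at a given
resolution. The function names are hypothetical, and the
bit-count rule (enough bits to represent the largest number
of resolution steps) is an assumption of this sketch.

```python
def ramp_time_s(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Time in seconds to move between two temperatures at a fixed ramp rate."""
    if rate_c_per_s <= 0:
        raise ValueError("ramp rate must be positive")
    return abs(end_c - start_c) / rate_c_per_s


def bits_needed(max_time_s: int, resolution_s: int) -> int:
    """Bits required to count up to max_time_s in steps of resolution_s."""
    return int(max_time_s // resolution_s).bit_length()


# Heating from 72 degC to 94 degC at the slowest and fastest rates mentioned.
print(ramp_time_s(72, 94, 0.5))  # slowest ramp, longest time
print(ramp_time_s(72, 94, 5.0))  # fastest ramp, shortest time

# Encoding a 120 s maximum at 1 s and at 10 s resolution.
print(bits_needed(120, 1), bits_needed(120, 10))
```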

Amplification is modelled using the time profiles designated
by each chromosome in the population and the membership
functions described earlier. Fitness is scored on the basis
of maximum amplification achieved using the shortest time
possible, and of maintaining minimum exposure to extremes of
the temperature profile which may promote mispriming,
inactivation and depurination. Under certain circumstances,
it may be beneficial to define other fitness characteristics
(e.g. with RAPDs, differential display, etc. it may be
necessary to promote mispriming events). Once all individuals
in the population have been evaluated, their fitnesses are
used as the basis of exploitation and exploration to select
chromosomes for subsequent rounds of selection. After a
specified number of iterations, the optimum time profiles are
highly represented in the chromosome population.

Neural network selection 

This application uses a feed-forward neural network with an
input layer, one or more hidden layers and an output layer
(Figure 7). Each node has a series of weighted inputs w_i,
representing external signals or the output from other nodes.
The number of hidden nodes is an adjustable parameter. The
sum of the weighted inputs is transformed using a non-linear
sigmoidal transformation function:

\[ f(x) = \frac{1}{1 + e^{-x/q}} \]

where f(x) has the range 0 to 1, x is the weighted sum of the
inputs, and q is the gain. Modification of q alters the shape
of the curve. A small value for q gives the sigmoidal
function a steep slope (e.g. q = 1.0). Conversely, a large
value for q gives the curve a shallow slope (e.g. q = 2.0).
Inputs represent transition times for temperature-controlled
events and the temperature ranges of those events (Figure 8).
There is one input node per variable. The input nodes
transfer the weighted input signals to the nodes in the
hidden layer. A connection between node i in the input layer
and node j in the hidden layer is represented by the
weighting factor w_ji. Hence, there is a vector of weights,
w_j, for each of the J nodes in the hidden layer. These
weights are adjusted during the training process. Each layer
also has a bias input to accommodate non-zero offsets in the
data.
The value of the bias is always set to one. A term is
included in the vector of weights to connect the bias to the
corresponding layer. This weight is also adjusted during the
training period. Other transfer functions (tanh, sine,
cosine, linear, etc.) may be used.
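
A single node of the kind described can be sketched as below.
Two points are assumptions of this sketch rather than
statements from the specification: the gain q is taken to
divide the weighted sum, so that a small q gives a steep
curve as the text describes, and the bias input is taken as a
constant 1 that is scaled by its own weight. The function
names and example values are hypothetical.

```python
import math


def transfer(x: float, gain: float = 1.0) -> float:
    """Sigmoid transfer function with range 0 to 1 (smaller gain = steeper)."""
    return 1.0 / (1.0 + math.exp(-x / gain))


def node_output(inputs, weights, bias_weight, gain=1.0):
    """Output of one node: sigmoid of the weighted sum of inputs plus bias."""
    weighted_sum = sum(w * i for w, i in zip(weights, inputs))
    weighted_sum += bias_weight * 1.0  # bias input assumed fixed at 1
    return transfer(weighted_sum, gain)


# Effect of the gain at the same weighted sum (x = 1.0).
for q in (0.25, 1.0, 2.0):
    print(q, round(transfer(1.0, q), 3))
```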

During an initial training procedure, a series of input
patterns with their corresponding expected outputs is
presented to the network in an iterative fashion while the
weights are adjusted. This is continued until the desired
level of agreement between expected and observed outputs has
been achieved. Different learning algorithms can be used;
however, back-propagation is currently the algorithm of
preference. The error in the expected output is
back-propagated through the network using the generalised
delta rule to determine the adjustment to the weights (see
"Parallel distributed processing: explorations in the
microstructure of cognition", Part 1, Rumelhart and
McClelland, (1986), MIT Press: Cambridge, MA). The output
layer error term is given by:

\[ \delta_{pk} = (t_{pk} - o_{pk}) \, o_{pk} \, (1 - o_{pk}) \]

where δ_pk is the error term for observation p at output
node k, t_pk is the expected output for observation p, and
o_pk is the actual node output. o_pk(1 - o_pk) is the
derivative of the sigmoidal function. The error term at node
j of the hidden layer is the derivative of the sigmoid
function multiplied by the sum of the products of the output
error terms and the weights in the output layer:

\[ \delta_{pj} = o_{pj} \, (1 - o_{pj}) \sum_{k} \delta_{pk} \, w_{kj} \]

The error terms from the output and hidden layers are
back-propagated through the network by adjusting the weights of
their respective layers. Weight adjustments, or delta
weights, are calculated according to:

$$\Delta w_{ji}(n) = \eta\, \delta_{pj}\, o_{pi} + \alpha\, \Delta w_{ji}(n - 1)$$

where $\Delta w_{ji}$ is the change in the weight between node j in
the hidden layer and node i in the input layer, $\eta$ is the
learning rate, $\delta_{pj}$ is the error term for observation p at
node j of the hidden layer, $o_{pi}$ is the observed output for
node i of the input layer for observation p, and $\alpha$ is the
momentum. The terms n and n-1 refer to the present iteration and
the previous iteration, respectively. The presentation of the
entire set of p training observations is repeated when the
number of iterations, n, exceeds p. A similar method is used
to adjust the weights connecting each hidden layer of nodes
to the next hidden layer, and between the final hidden layer
and the output layer. All weights are initially given random
values prior to training.
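The error terms and the weight update described above can be written out directly. The sketch below is illustrative only: the function and argument names are not from the source, and it handles a single observation and a single weight at a time.

```python
def output_delta(t, o):
    """Error term at an output node: (t - o) * o * (1 - o)."""
    return (t - o) * o * (1.0 - o)


def hidden_delta(o_j, output_deltas, weights_to_outputs):
    """Error term at hidden node j.

    o_j * (1 - o_j) times the sum over output nodes k of delta_k * w_kj.
    """
    back = sum(d * w for d, w in zip(output_deltas, weights_to_outputs))
    return o_j * (1.0 - o_j) * back


def delta_weight(eta, delta_j, o_i, alpha, previous_delta_weight):
    """Weight change with momentum: eta * delta_j * o_i + alpha * previous."""
    return eta * delta_j * o_i + alpha * previous_delta_weight


# Example: one output node, one hidden node, one input node.
d_k = output_delta(t=1.0, o=0.5)
d_j = hidden_delta(o_j=0.5, output_deltas=[d_k], weights_to_outputs=[2.0])
dw = delta_weight(eta=0.1, delta_j=d_j, o_i=1.0, alpha=0.8,
                  previous_delta_weight=0.01)
```

The momentum term reuses the previous change in the same weight, which damps oscillation between successive iterations.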

The neural net algorithms are used to associate membership
functions with time profiles that define each amplification
stage. After training, this procedure can be used to predict
the level of amplification with a high degree of accuracy.
Comparison of predicted amplification levels with real-time
monitoring could be used to optimise reactions further.

The methods described by Cobb and Clarkson (1994) provide a
rapid method for evaluating the effects of various reaction
components on the level of amplification obtained from a
reaction. This can be used to train the neural network
element of this process. The neural network takes key
information from the reaction to predict amplification
levels. A five-input network with two hidden layers, in a
5,5,3,1 format, was used to predict amplification from
annealing temperature, template, primer, dNTP and Mg2+
concentration data. 27 training reactions were performed
using conditions described in Table 2.

Table 2: Reaction conditions and amplification levels used to train
a 4-layer neural network with 5 input nodes, two hidden layers
with 5 and 3 nodes, and a single-node output layer.

Reaction  Temp (°C)  Primer (mM)  Template (ng)  dNTPs (mM)  Mg2+ (mM)  Amplification
1         45         10           50             0.1         1          2
2         45         10           100            0.2         2          2
3         45         10           150            0.3         3          5
4         45         20           50             0.2         3          5
5         45         20           100            0.3         1          0
6         45         20           150            0.1         2          0
7         45         30           50             0.3         2          0
8         45         30           100            0.1         3          0
9         45         30           150            0.2         1          0
10        50         10           50             0.1         1          2
11        50         10           100            0.2         2          3
12        50         10           150            0.3         3          4
13        50         20           50             0.2         3          1
14        50         20           100            0.3         1          3
15        50         20           150            0.1         2          4
16        50         30           50             0.3         2          5
17        50         30           100            0.1         3          4
18        50         30           150            0.2         1          3
19        55         10           50             0.1         1          0
20        55         10           100            0.2         2          1
21        55         10           150            0.3         3          5
22        55         20           50             0.2         3          5
23        55         20           100            0.3         1          0
24        55         20           150            0.1         2          2
25        55         30           50             0.3         2          0
26        55         30           100            0.1         3          3
27        55         30           150            0.2         1          0

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The amount of amplification obtained from each of these 
reactions was scored on a 0 to 5 scale. These data were
used to train a neural network using back-propagation and
sigmoidal transfer functions, although this process is not
limited to these transfer functions and learning algorithms.
Figures 9 to 15 show the output from the neural network during
training, and how each of the inputs relates to the others.
The optimum temperature for the reaction is clearly ca. 50°C
from these data. Network information is described in Tables 3
to 5. (N.B. Working with these data, genetic algorithms could
be used to define reaction optima by testing a wide range of
input signals and selecting for maximum outputs.)
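The search suggested in the note above might be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the source: `toy_predictor` is a hypothetical stand-in for the trained network (it simply peaks at 50°C), the bounds are the level ranges of Table 2, and the selection, crossover and mutation settings are arbitrary choices.

```python
import random

# Ranges of the five inputs, taken from the levels in Table 2:
# temperature (°C), primer, template (ng), dNTPs, Mg2+.
BOUNDS = [(45, 55), (10, 30), (50, 150), (0.1, 0.3), (1, 3)]


def toy_predictor(x):
    """Hypothetical stand-in for the trained network; peaks at 50 °C."""
    return 5.0 - 0.2 * (x[0] - 50.0) ** 2


def random_individual(rng):
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]


def mutate(ind, rng, rate=0.2):
    """Perturb some inputs with Gaussian noise, clamped to the bounds."""
    out = []
    for value, (lo, hi) in zip(ind, BOUNDS):
        if rng.random() < rate:
            value += rng.gauss(0.0, 0.1 * (hi - lo))
        out.append(min(hi, max(lo, value)))
    return out


def crossover(a, b, rng):
    """Uniform crossover: each input comes from one parent or the other."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def optimise(predict, generations=50, pop_size=30, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=predict, reverse=True)
        parents = pop[: pop_size // 2]  # best half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=predict)


best_inputs = optimise(toy_predictor)
```

Because the best half of each generation is carried over unchanged, the best predicted output never decreases from one generation to the next.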



WO 99/18516 



PCT/GB98/02989 






Table 3: Network Training Information

Network Name:           NO NAME
Number of Layers:       4

Input Layer:
  Nodes:                5
  Transfer Function:    Linear
Hidden Layer 1:
  Nodes:                5
  Transfer Function:    Sigmoid
Hidden Layer 2:
  Nodes:                3
  Transfer Function:    Sigmoid
Output Layer:
  Nodes:                1
  Transfer Function:    Sigmoid

Connections:            FULL

Training Information:
  Iterations:           775
  Training Error:       0.002895
  Learn Rate:           0.111424
  Momentum Factor:      0.800000
  Fast-Prop Coef:       0.000000

Input Node File:        C:\Qnet97t\samples\PCR Input.txt
Input Start Column:
Normalize Inputs:       YES

Output Node File:       C:\Qnet97t\samples\PCR Output.txt
Output Start Column:    1
Normalize Outputs:      YES

Training Patterns:      27
Test Pattern:           0
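Table 3 records that inputs and outputs were normalised before training. The scaling applied by the software is not stated, so the sketch below assumes simple min-max scaling of each column to the range 0 to 1; the function name is illustrative.

```python
def min_max_normalise(column):
    """Scale a column of values linearly so its minimum is 0 and maximum is 1.

    Assumption: min-max scaling; the source does not say which scheme was used.
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in column]


# The three annealing temperature levels of Table 2.
scaled_temps = min_max_normalise([45, 50, 55])
```

Scaling puts inputs with very different magnitudes, such as template (50 to 150 ng) and dNTPs (0.1 to 0.3 mM), on a common footing before they reach the sigmoidal nodes.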
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Table 4: Training Targets and Predicted Outputs for PCR 
Amplification Using a 4-Layered Neural Network.

Reaction  Target    Predicted
1         2.000000   2.008679
2         2.000000   1.998197
3         5.000000   4.999532
4         5.000000   4.993848
5         0.000000  -0.065793
6         0.000000  -0.005967
7         0.000000   0.007813
8         0.000000  -0.000233
9         0.000000   0.048163
10        2.000000   1.996414
11        3.000000   3.010950
12        4.000000   3.986164
13        1.000000   1.012590
14        3.000000   2.998190
15        4.000000   3.986575
16        5.000000   4.986286
17        4.000000   4.007507
18        3.000000   3.012231
19        0.000000  -0.060089
20        1.000000   1.000848
21        5.000000   5.024660
22        5.000000   4.970664
23        0.000000   0.000820
24        2.000000   1.995387
25        0.000000   0.066374
26        3.000000   2.999384
27        0.000000  -0.000950
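The agreement between targets and predicted outputs in Table 4 can be summarised numerically. The sketch below transcribes the 27 value pairs from Table 4 and computes the root-mean-square error and the largest absolute deviation on the 0 to 5 scale; the helper names are illustrative, and the result is not expected to equal the training error in Table 3, which was presumably computed on normalised values.

```python
import math

# Targets and predicted outputs for reactions 1 to 27, from Table 4.
targets = [2, 2, 5, 5, 0, 0, 0, 0, 0, 2, 3, 4, 1, 3,
           4, 5, 4, 3, 0, 1, 5, 5, 0, 2, 0, 3, 0]
predicted = [2.008679, 1.998197, 4.999532, 4.993848, -0.065793, -0.005967,
             0.007813, -0.000233, 0.048163, 1.996414, 3.010950, 3.986164,
             1.012590, 2.998190, 3.986575, 4.986286, 4.007507, 3.012231,
             -0.060089, 1.000848, 5.024660, 4.970664, 0.000820, 1.995387,
             0.066374, 2.999384, -0.000950]


def rmse(t, p):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t, p)) / len(t))


def max_abs_error(t, p):
    """Largest absolute deviation and the 1-based reaction number it occurs at."""
    errors = [abs(a - b) for a, b in zip(t, p)]
    worst = max(errors)
    return worst, errors.index(worst) + 1


overall = rmse(targets, predicted)
worst_error, worst_reaction = max_abs_error(targets, predicted)
```

All 27 reactions here are training patterns (Table 3 lists no test patterns), so these figures describe the fit to the training data rather than predictive accuracy on unseen reactions.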
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Table 5: Network Weights and Adjustment Deltas: 



Network Weights and Current Adjustment Deltas

Network Name: NO NAME
Iterations: 10000

Layer  Node  Connection  Weight      Weight Delta
2      1     1             1.15091    0.000014
2      1     2             0.86859    0.000007
2      1     3             1.72548    0.000003
2      1     4            -2.79642    0.000000
2      1     5            -6.97007   -0.000008
2      2     1             1.77795   -0.000003
2      2     2            -0.03676   -0.000008
2      2     3             4.06031   -0.000007
2      2     4            -0.75059   -0.000002
2      2     5            -4.99932    0.000007
2      3     1            -2.30613   -0.000104
2      3     2            -2.38492   -0.000003
2      3     3            -0.72564   -0.000005
2      3     4             0.35246    0.000009
2      3     5             4.50212    0.000009
2      4     1           -10.99939    0.000005
2      4     2             4.12186   -0.000005
2      4     3            -0.27209    0.000008
2      4     4             0.64145   -0.000016
2      4     5            -0.63613   -0.000005
2      5     1            -7.46985   -0.000008
2      5     2            -1.74181    0.000004
2      5     3             2.85897   -0.000001
2      5     4            -0.11206    0.000003
2      5     5             3.16325    0.000007
3      1     1            -1.70294   -0.000002
3      1     2             2.71630   -0.000002
3      1     3            -1.11906   -0.000002
3      1     4             5.30498    0.000015
3      1     5            -3.25660   -0.000011
3      2     1             5.48513    0.000007
3      2     2             4.11580    0.000003
3      2     3            -0.09909   -0.000007
3      2     4             6.50361    0.000002
3      2     5            -7.38153   -0.000009
3      3     1             0.12320    0.000000
3      3     2             1.73109    0.000007
3      3     3            -4.40772    0.000000
3      3     4            -0.02973   -0.000009
3      3     5            -1.14767   -0.000002
4      1     1             6.41485    0.000013
4      1     2            -7.82154   -0.000013
4      1     3             3.38027   -0.000002
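As a consistency check on Table 5, the number of connection weights in a fully connected 5,5,3,1 network can be counted layer by layer. The helper below is an illustrative sketch; it counts only node-to-node connections, since bias weights are not listed in the table.

```python
def connections_per_layer(layer_sizes):
    """Connection weights feeding each layer after the input layer.

    A fully connected layer with b nodes fed by a nodes has a * b weights.
    """
    return [a * b for a, b in zip(layer_sizes, layer_sizes[1:])]


per_layer = connections_per_layer([5, 5, 3, 1])  # layers 2, 3 and 4
total = sum(per_layer)
```

This gives 25, 15 and 3 weights for layers 2, 3 and 4, that is 43 in total, which matches the number of rows listed in Table 5.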
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Applications 

Currently, the major challenge for PCR techniques is the
development of "intelligent" instrumentation that can
control the profile of the temperature cycles to optimise the
level of amplification. Presently this requires a high level
of technical involvement to define cycling conditions. Optima
are found by repetitive trial-and-error experiments. This is
expensive in terms of lab time, operator time and
consumables. Optimal conditions are rarely found and, where
sub-optimal conditions are used, subsequent interpretation
and/or analysis of the amplification results is difficult.

The process described here can be used to optimise thermal
cycling conditions by intelligent control of the cycling
conditions. It takes advantage of the fact that denaturation,
annealing and extension are not limited to fixed temperatures
in the cycle but extend over a range of temperatures in the
profile. Modelling these events over the entire cycle allows
optimisation of the times allowed for each event. Mispriming,
inactivation and depurination events are reduced as a
consequence of the optimisation by decreasing exposure to
conditions that promote these events. This method can be
further refined by taking into account the difference between
block and sample temperature. There is a considerable lag
between the two, which has a significant effect on the level
of amplification.
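The idea that each event extends over a range of temperatures can be illustrated with simple membership functions. Everything specific in the sketch below is an assumption made for illustration: the triangular shape, the temperature ranges in `EVENTS`, and the function names are hypothetical and are not values from the source.

```python
def triangular(temp, low, peak, high):
    """Membership rising linearly from low to peak, then falling to high."""
    if temp <= low or temp >= high:
        return 0.0
    if temp <= peak:
        return (temp - low) / (peak - low)
    return (high - temp) / (high - peak)


# Hypothetical (low, peak, high) temperatures in °C for each event.
EVENTS = {
    "annealing": (40.0, 55.0, 65.0),
    "extension": (60.0, 72.0, 80.0),
    "denaturation": (85.0, 94.0, 100.0),
}


def effective_times(profile, dt):
    """Membership-weighted time per event over a sampled temperature profile.

    profile is a list of sample temperatures taken every dt seconds.
    """
    totals = {name: 0.0 for name in EVENTS}
    for temp in profile:
        for name, (low, peak, high) in EVENTS.items():
            totals[name] += triangular(temp, low, peak, high) * dt
    return totals


# A short ramp from annealing towards extension, sampled once per second.
times = effective_times([55.0, 60.0, 65.0, 72.0, 72.0], dt=1.0)
```

Using sample rather than block temperatures in `profile` is one way to account for the lag between the two.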

Details about the reaction mixture, concentration of Mg2+,
dNTPs, primer sequence, template data, etc., are used to
define the membership functions. Reaction data and predicted
template size are used to model the reaction and determine
the level of amplification. Genetic algorithms are used to
optimise the time profile of the cycles. This removes the
need for operator programming and provides a framework for
program standardisation.

Combining on-line monitoring of amplification with this
process allows optimum amplification conditions to be
maintained dynamically in real time. This has particular
use in diagnostic applications where maintaining the
integrity of the amplification is a critical factor for
quality control. Comparison of expected and achieved
amplification can be used to determine where reaction
performance is being compromised. This information can be
used to update cycling conditions in order to maintain
optimum amplification, and/or to warn the user about the
problem.
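The comparison of expected and achieved amplification might be sketched as a simple deviation check. The function name, the return labels and the tolerance value are hypothetical; a suitable tolerance would have to be established for each assay.

```python
def check_amplification(predicted, observed, tolerance=0.5):
    """Compare observed with predicted amplification.

    Returns a status ("ok", "low" or "high") and the signed deviation.
    The default tolerance is an arbitrary illustrative value.
    """
    deviation = observed - predicted
    if abs(deviation) <= tolerance:
        return "ok", deviation
    return ("low" if deviation < 0 else "high"), deviation


status, deviation = check_amplification(predicted=4.0, observed=2.0)
```

A "low" status could trigger either an adjustment of the cycling conditions or a warning to the user.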


Future challenges for NAA technologies include
decentralisation from analysers in hospital laboratories, which
have high associated costs in terms of laboratory space and
the employment of technical operators, to application in the
field (e.g. by local health clinics). This has a number of
problems associated with it. Adverse operating conditions,
differences in sample preparation and operator error will
all alter amplification and subsequent data interpretation.
Robust amplification conditions and quality control
procedures that compensate for changes in operational
environment will speed acceptance of PCR-based
diagnostics.

For a qualitative test, important performance characteristics
need to be understood and defined for each application.
Specimen type and clinical setting may affect interpretation
of NAA tests. False-negative PCR results often occur due to
the presence of inhibitors in the clinical specimen, changes
in the operational environment or operator error.
False-positive results may stem from contamination of reagents
with target sequences. The sensitivity cut-off (e.g. sensitivity
to gene copy number, annealing temperature, etc.), and the
zone of poor reproducibility around this cut-off, can be
defined in terms of PCR performance characteristics. These
characteristics, along with the type of specimen and the
clinical purpose of the test, provide a framework on which
quality control can be considered. Our application provides
the basis for maintaining optimal amplification and for rapidly
establishing the factors controlling the cut-off limits for
particular amplifications.

Reactions optimised by our process have inherently increased
robustness. This process can therefore be used as the basis
of a standardised optimisation procedure. Robustness to
variables such as the variation in the performance of the
cycling instrument, etc., will enhance specificity and
sensitivity. Additionally, a large amount of information can
be extracted using this procedure, providing the basis for
stringent quality control of reaction performance. Vital
information about the performance of the thermal cycler,
variation in essential components of the reaction mixture,
operating environment, and the level of nucleic acid material
provided for the test can be obtained from a limited number
of reactions. Importantly, the preferred method provides a
means for "building in" the ability to operate under
different conditions. It also allows feedback-optimisation of
subsequent assays by continually monitoring the performance
of the assay procedure.

The disclosure in British patent application no.
9720926.6, from which this application claims priority, and
in the abstract accompanying this application, are
incorporated herein by reference.
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CLAIMS 

1. A method of optimising the cycling conditions used to
control a polymerase chain reaction by assigning membership
values for denaturation, annealing and extension events in
order to determine the relative contribution of each event
during the reaction, and using genetic algorithms to
determine the optimum times required to complete each event.

2. A method according to claim 1, wherein the relative
contributions of denaturation, annealing and extension are
calculated through assignment of membership values.

3. A method according to claim 2, wherein membership values
are used to determine the relative amounts of denaturation,
annealing and extension at specific time points or over a
series of time points.

4. A method according to claim 1, wherein genetic
algorithms are used to determine the optimum times for
denaturation, annealing and extension.

5. A method according to claim 1, wherein the process is
used to standardise times used for PCR protocols or to
optimise PCR protocols.

6. A method according to claim 1, wherein the process is
used to transfer protocols from one thermal cycler to
another, wherein the relative contributions of denaturation,
annealing and extension are first calculated taking into
account the thermal performance of the source cycler, and
then transferred to the target cycler by taking into account
the differences in cycler performance.
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7. A method according to claim 1, wherein on-line 
monitoring is used to provide feed-back information about the 
performance of a reaction to allow fine adjustment of the 
calculated cycling times. 

8. A method according to any preceding claim, wherein a 
neural network is used to gain information about optimum 
cycling conditions. 

9. A method according to any preceding claim, wherein a
neural network is used to calculate, confirm or modify the
calculated level of amplification.

10. A method in which a neural network is used to predict
the level of amplification for a given set of reaction
criteria.

11. A method according to claim 10, wherein training inputs
for the neural network are data from an orthogonal array, and
outputs represent the level of amplification obtained from an
associated PCR.

12. A method according to claim 11, wherein an orthogonal 
array is used to reduce the number of training inputs for the 
neural network. 

13. A method according to any preceding claim, wherein the 
property determined is the sensitivity cut-off. 

14. A method according to any preceding claim, wherein the 
property determined is the zone of reproducibility around the 
cut-off. 
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15. A method according to any preceding claim, wherein the 
process is used as a quality control procedure by determining 
deviations from a predicted level of amplification. 

16. A system for optimising the cycling conditions used to
control a polymerase chain reaction including processing
means operable to assign membership values for denaturation,
annealing and extension events, to determine the relative
contribution of each event during the reaction, the
processing means including genetic algorithm means operable
to determine the optimum times required to complete each
event.
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Figure 2 
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Figure 4 
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Scatter Comparison of Normalized Targets vs. Net Outputs 
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